pacman:: p_load(sf, terra, spatstat, tmap, rvest, tidyverse)02-1: First-order Spatial Point Pattern Analysis
1. Overview
Spatial Point Pattern Analysis (SPPA) refers to the study of how points are arranged or distributed across a given surface. There points may represent:
Events, such as crimes, road accidents, or disease occurrences, or
Service and facility locations, including shops (e.g., cafes, supermarkets) and community facilities like childcare or aged care centres.
First-order Spatial Point Pattern Analysis (1-st SPPA) examines the intensity or density of points across a study area. It identifies spatial trends in point distribution without considering interaction between points helping to answer questions like:
Where are points most concentrated?
Is density uniform or variable?
How dispersed is the pattern?
In this exercise, the spatsat is utilised to apply two common 1st-SPPA methods to explore:
Whether childcare centres in Singapore are randomly distributed;
If not, identifying areas with higher concentrations of centres.
2. The Data
The following datasets were downloaded from publicly available websites, and both are available in KML and GeoJSON format.
| Dataset Name | Source | Discrption |
|---|---|---|
| Child Care Services | data.gov.sg | Point feature data: contains the locations and attributes of childcare centres. |
| Master plan 2019 Subzone Boundary (No Sea) | singstat | Polygon feature data: represents the URA 2019 Master Plan planning subzone boundaries. |
3. Installing and Loading the R Packages
In addition to spatstat, a total of five R packages will be used in this exercise.
| Package | Discription |
|---|---|
| sf | Simple Features, a new R package which handles importing, managing, and processing vector-based geospatial data. |
| spatstat | Provides useful functions for SPPA, which will be called to conduct both 1st and 2nd SPPA and KDE. |
| terra | Modern package for raster/vector spatial data and will be used to convert spastat outputs into terra format. |
| tmap | Creates high quality static or interactive choropleth maps via leaflet. |
| rvest | Scrapes and extracts data from web pages. |
After installation, we load them into R environment using the code below.
4. Importing and Wrangling Geospatial Data Sets
The following code chunk shows the steps to first import the Master plan 2019 Subzone Boundary (No Sea) data using st_read, extract the required 4 columns from the Description field, filter out the nearby islands, and finally save the file as mpsz_cl for further analysis.
mpsz_sf <- st_read("data/MasterPlan2019SubzoneBoundaryNoSeaKML.kml") %>%
st_zm(drop = TRUE, what = "ZM") %>%
st_transform(crs = 3414)Reading layer `URA_MP19_SUBZONE_NO_SEA_PL' from data source
`C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\MasterPlan2019SubzoneBoundaryNoSeaKML.kml'
using driver `KML'
Simple feature collection with 332 features and 2 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 103.6057 ymin: 1.158699 xmax: 104.0885 ymax: 1.470775
Geodetic CRS: WGS 84
extract_kml_field <- function(html_text, field_name) {
if (is.na(html_text) || html_text == "") return(NA_character_)
page <- read_html(html_text)
rows <- page %>% html_elements("tr")
value <- rows %>%
keep(~ html_text2(html_element(.x, "th")) == field_name) %>%
html_element("td") %>%
html_text2()
if (length(value) == 0) NA_character_ else value
}# map_chr of purr (tidyverse) applies a function to each element of a list/vector and returns a character vector.
mpsz_sf <- mpsz_sf %>%
mutate(
REGION_N = map_chr(Description, extract_kml_field, "REGION_N"),
PLN_AREA_N = map_chr(Description, extract_kml_field, "PLN_AREA_N"),
SUBZONE_N = map_chr(Description, extract_kml_field, "SUBZONE_N"),
SUBZONE_C = map_chr(Description, extract_kml_field, "SUBZONE_C")
) %>%
select(-Name, -Description) %>%
relocate(geometry, .after = last_col())mpsz_cl <- mpsz_sf %>%
filter(SUBZONE_N != "SOUTHERN GROUP",
PLN_AREA_N != "WESTERN ISLANDS",
PLN_AREA_N != "NORTH-EASTERN ISLANDS")write_rds(mpsz_cl,
"data/mpsz_cl.rds")The code chuck below imports downloaded ChildCareServices data to R as sf data frame as childcare_sf by using st_read, coverts 3d to 2d (st_zm) and finally transform the CRS from WGS84 to SVY21.
childcare_sf <- st_read("data/ChildCareServices.geojson") %>%
st_zm(drop = TRUE, what = "ZM") %>% # Drop Z and M to convert from multi-dimensional to 2d (XY)
st_transform(crs = 3414)Reading layer `ChildCareServices' from data source
`C:\lsrgc\ISSS626-yiqiong-pan\Hands-on_Ex\Hands-on_Ex02\data\ChildCareServices.geojson'
using driver `GeoJSON'
Simple feature collection with 1925 features and 2 fields
Geometry type: POINT
Dimension: XYZ
Bounding box: xmin: 103.6878 ymin: 1.247759 xmax: 103.9897 ymax: 1.462134
z_range: zmin: 0 zmax: 0
Geodetic CRS: WGS 84
4.1 Mapping the Geospatial Data Sets
Using the tmap mapping methods, the code chunk below creates a map combining childcare_sf and mpsz_cl.
tm_shape(mpsz_cl)+
tm_fill(col = "grey" ) +
tm_borders(col = "black") +
tm_shape(childcare_sf) +
tm_dots(col = "black")
Alternatively, an interactive thematic map can be plotted using the code below. The interactive map is easy to navigate and query intuitively. It is optional to change the background map layer(choices: ESRI.WorldGrayCanvas(default), OpenStreetMap, ESRI.WorldTopoMap).
tmap_mode('view')
tm_shape(childcare_sf) +
tm_dots() #creates a layer of dots to visualise point data on a map.tmap_mode('plot') #switch back static mapsIt is advised to always switch back to plot mode to save connection consumption and limit the number of interactive maps to 10 in one documents when publishing.
5. Geospatial Data Wrangling
In this section, the data (sf objects) will be converted to spatstat data structure: ppp (for point data) and owin for observation window.
5.1 Converting sf Data Frames to PPP
Here we use as.ppp() of spatstat package to covert the point data childcare_sf to ppp file, confirm the change using class() and have a quick overview of the data statistics via summary().
childcare_ppp <- as.ppp(childcare_sf)
class(childcare_ppp)[1] "ppp"
summary(childcare_ppp)Marked planar point pattern: 1925 points
Average intensity 2.417323e-06 points per square unit
Coordinates are given to 11 decimal places
Mark variables: Name, Description
Summary:
Name Description
Length:1925 Length:1925
Class :character Class :character
Mode :character Mode :character
Window: rectangle = [11810.03, 45404.24] x [25596.33, 49300.88] units
(33590 x 23700 units)
Window area = 796335000 square units
Before moving forwards, let’s check if there are any duplicated points.
any(duplicated(childcare_ppp))[1] FALSE
5.2 Creating Owin Object
Similarly, the owin object can be created using the function as.owin() for polygon data. After the conversion, the class() and plot() functions can be used to verify that the object is of the correct class and that the data retains its original shape.
sg_owin <- as.owin(mpsz_cl)
class(sg_owin)[1] "owin"
plot(sg_owin)
5.3 Combining Point Events object and Owin Object
The code chunk below combines ppp and owin into one ppp file which means it updates the window of childcare_ppp to sg_owin and keeps the points that fall inside.
childcareSG_PPP = childcare_ppp[sg_owin]
class(childcareSG_PPP)[1] "ppp"
childcareSG_PPPMarked planar point pattern: 1925 points
Mark variables: Name, Description
window: polygonal boundary
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
why does plotting not work here?
summary(childcareSG_PPP)Marked planar point pattern: 1925 points
Average intensity 2.875208e-06 points per square unit
Coordinates are given to 11 decimal places
Mark variables: Name, Description
Summary:
Name Description
Length:1925 Length:1925
Class :character Class :character
Mode :character Mode :character
Window: polygonal boundary
41 separate polygons (26 holes)
vertices area relative.area
polygon 1 285 1.61128e+06 2.41e-03
polygon 2 27 1.50315e+04 2.25e-05
polygon 3 (hole) 41 -4.01660e+04 -6.00e-05
polygon 4 (hole) 317 -5.11280e+04 -7.64e-05
polygon 5 (hole) 3 -4.14100e-04 -6.19e-13
polygon 6 30 2.80002e+04 4.18e-05
polygon 7 (hole) 4 -2.86396e-01 -4.28e-10
polygon 8 (hole) 3 -1.81439e-04 -2.71e-13
polygon 9 (hole) 3 -5.99531e-04 -8.95e-13
polygon 10 (hole) 3 -3.04560e-04 -4.55e-13
polygon 11 (hole) 3 -4.46108e-04 -6.66e-13
polygon 12 (hole) 5 -2.44408e-04 -3.65e-13
polygon 13 (hole) 5 -3.64686e-02 -5.45e-11
polygon 14 71 8.18750e+03 1.22e-05
polygon 15 (hole) 38 -7.79904e+03 -1.16e-05
polygon 16 91 1.49663e+04 2.24e-05
polygon 17 (hole) 395 -7.38124e+03 -1.10e-05
polygon 18 40 1.38607e+04 2.07e-05
polygon 19 (hole) 11 -8.36705e+01 -1.25e-07
polygon 20 (hole) 3 -2.33435e-03 -3.49e-12
polygon 21 45 2.51218e+03 3.75e-06
polygon 22 139 3.22293e+03 4.81e-06
polygon 23 148 3.10395e+03 4.64e-06
polygon 24 (hole) 4 -1.72650e-04 -2.58e-13
polygon 25 75 1.73526e+04 2.59e-05
polygon 26 83 5.28920e+03 7.90e-06
polygon 27 106 3.04104e+03 4.54e-06
polygon 28 71 5.63061e+03 8.41e-06
polygon 29 10 1.99717e+02 2.98e-07
polygon 30 (hole) 3 -1.37223e-02 -2.05e-11
polygon 31 (hole) 3 -8.68789e-04 -1.30e-12
polygon 32 (hole) 3 -3.39815e-04 -5.08e-13
polygon 33 (hole) 3 -4.52041e-05 -6.75e-14
polygon 34 (hole) 3 -3.90173e-05 -5.83e-14
polygon 35 (hole) 3 -9.59845e-05 -1.43e-13
polygon 36 (hole) 8 -4.28707e-01 -6.40e-10
polygon 37 (hole) 4 -2.18619e-04 -3.27e-13
polygon 38 (hole) 6 -8.37554e-01 -1.25e-09
polygon 39 (hole) 5 -2.92235e-04 -4.36e-13
polygon 40 14053 6.67892e+08 9.98e-01
polygon 41 (hole) 3 -7.43616e-06 -1.11e-14
enclosing rectangle: [2667.54, 55941.94] x [21448.47, 50256.33] units
(53270 x 28810 units)
Window area = 669517000 square units
Fraction of frame area: 0.436
plot(childcareSG_PPP)
6. Clark-Evans Test for Nearest Neighbour Anaylysis (NNA)
Nearest Neighbor Analysis (NNA) is a general spatial analytical approach that examines the distribution of points by calculating the average nearest-neighbour distance to identify whether the pattern is clustered, random, or regular.
The Clark-Evans Test is a specific method within the NNA framework that uses the Clark-Evans aggregation index (R) to evaluate the spatial distribution of points.
The interpretation rule is:
𝑅< 1: clustered pattern
𝑅= 1: random pattern
𝑅> 1: regular pattern
In this study, the function clarkevans.test() from the spatstat package will be used. The hypotheses are defined as follows:
Null hypothesis (H0): Childcare services are randomly distributed.
A 95% confidence level will be applied for the test.
6.1 Performing the Clark-Evans Test without CSR (Monte Carlo)
clarkevans.test()has two modes without CSR and with CSR. Here we try out the mode without CSR. The alternative hypothesis (H1) here is the childcare services in Singapore are clustered.
Statistical Conclusion Since the Clark-Evans test shows an aggregation index R 0.535 which is less than 1 and p value is close to zero, we reject the null hypothesis of random pattern and conclude the childcare services in Singapore are clustered at 95% confidence level.
Business Communication The statistical results indicate that childcare centres in Singapore are unevenly distributed across the island. Given the critical role of childcare services in supporting social welfare and community stability, it is essential to zoom in and identify underserved areas and prioritise them in future planning.
clarkevans.test(childcareSG_PPP,
correction = "none",
clipregion = "sg_owin",
alternative= c("clustered")
)
Clark-Evans test
No edge correction
Z-test
data: childcareSG_PPP
R = 0.53532, p-value < 2.2e-16
alternative hypothesis: clustered (R < 1)
6.2 Performing the Clark-Evans Test with CSR (Monte Carlo?)
The code chunk below performs Clark-Evans Test with 99 times simulation.
Statistical Conclusion The Clark-Evans test shows an aggregation index R 0.535 and p value of 0.01. As a result we reject the null hypothesis of random pattern (CSR) and conclude the childcare services in Singapore are more clustered at 95% confidence level.
Business Communication The statistical results based on 99 Monte Carlo simulations confirm that childcare centres in Singapore are not evenly distributed. It is therefore essential to identify underserved areas and prioritise them in future planning. Policymakers may consider providing incentives to attract service providers to establish new centres in these regions or, in the interim, offering subsidies to local families to ensure a smoother transition and equitable access to childcare services.
clarkevans.test(childcareSG_PPP,
correction ="none",
clipregion ="sg_owin",
alternative= "less",
method = "MonteCarlo",
nsim = 99)
Clark-Evans test
No edge correction
Monte Carlo test based on 99 simulations of CSR with fixed n
data: childcareSG_PPP
R = 0.53532, p-value = 0.01
alternative hypothesis: clustered (R < 1)
7. Kernel Density Estimation Method
Kernel Density Estimation (KDE) creates a continuous surface from point data, allowing us to visualise clusters, hotspots, and spatial variation.It provides an initial understanding of childcare service distribution in Singapore.
7.1 Working with Automatic Bandwidth Selection Method
The density() function is used to compute a KDE of the childcareSG_PPP point pattern and smooth these point locations into a continuous intensity surface. sigma is the bandwidth (smoothing parameter): a smaller sigma produces a spikier output, while a larger sigma smooths the surface. Bandwidth selectors include bw.diggle, bw.CvL, bw.scott, and bw.ppl. Enabling edge correction helps avoid border bias. The kernel function defines the shape of the smoothing: “gaussian” is the smoothest, while alternatives such as “epanechnikov”, “quartic”, or “disc” produce sharper surfaces.
kde_SG_diggle <- density (
childcareSG_PPP,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian")The density() function in spatstat returns an im object, a pixel image that stores raster data as a grid of values. Each pixel represents intensity at that location, and the result can be visualised using plot() and summary() for a quick overview.
plot(kde_SG_diggle)
summary(kde_SG_diggle)real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2667.538, 55941.94] x [21448.47, 50256.33] units
dimensions of each pixel: 416 x 225.0614 units
Image is defined on a subset of the rectangular grid
Subset area = 669941961.12249 square units
Subset area fraction = 0.437
Pixel values (inside window):
range = [-6.584123e-21, 3.063698e-05]
integral = 1927.788
mean = 2.877545e-06
We can also check the bandwidth using bw.diggle().
bw <- bw.diggle(childcareSG_PPP)
bw sigma
295.9712
7.2 Rescalling KDE Values
As seen in the summary statistic of kde_SG_diggle, the KDE output ranges from 0 to 0.000037 which is hard to connect with real world interpretation. Since this is due to the unit in SVY21 is metre, here we convert the unit from metre to kilometre using rescale.ppp().
childcareSG_ppp_km <- rescale.ppp(
childcareSG_PPP, 1000, "km")
summary(childcareSG_ppp_km)Marked planar point pattern: 1925 points
Average intensity 2.875208 points per square km
Coordinates are given to 14 decimal places
Mark variables: Name, Description
Summary:
Name Description
Length:1925 Length:1925
Class :character Class :character
Mode :character Mode :character
Window: polygonal boundary
41 separate polygons (26 holes)
vertices area relative.area
polygon 1 285 1.61128e+00 2.41e-03
polygon 2 27 1.50315e-02 2.25e-05
polygon 3 (hole) 41 -4.01660e-02 -6.00e-05
polygon 4 (hole) 317 -5.11280e-02 -7.64e-05
polygon 5 (hole) 3 -4.14100e-10 -6.19e-13
polygon 6 30 2.80002e-02 4.18e-05
polygon 7 (hole) 4 -2.86396e-07 -4.28e-10
polygon 8 (hole) 3 -1.81440e-10 -2.71e-13
polygon 9 (hole) 3 -5.99531e-10 -8.95e-13
polygon 10 (hole) 3 -3.04560e-10 -4.55e-13
polygon 11 (hole) 3 -4.46108e-10 -6.66e-13
polygon 12 (hole) 5 -2.44408e-10 -3.65e-13
polygon 13 (hole) 5 -3.64686e-08 -5.45e-11
polygon 14 71 8.18750e-03 1.22e-05
polygon 15 (hole) 38 -7.79904e-03 -1.16e-05
polygon 16 91 1.49663e-02 2.24e-05
polygon 17 (hole) 395 -7.38124e-03 -1.10e-05
polygon 18 40 1.38607e-02 2.07e-05
polygon 19 (hole) 11 -8.36705e-05 -1.25e-07
polygon 20 (hole) 3 -2.33435e-09 -3.49e-12
polygon 21 45 2.51218e-03 3.75e-06
polygon 22 139 3.22293e-03 4.81e-06
polygon 23 148 3.10395e-03 4.64e-06
polygon 24 (hole) 4 -1.72650e-10 -2.58e-13
polygon 25 75 1.73526e-02 2.59e-05
polygon 26 83 5.28920e-03 7.90e-06
polygon 27 106 3.04104e-03 4.54e-06
polygon 28 71 5.63061e-03 8.41e-06
polygon 29 10 1.99717e-04 2.98e-07
polygon 30 (hole) 3 -1.37223e-08 -2.05e-11
polygon 31 (hole) 3 -8.68789e-10 -1.30e-12
polygon 32 (hole) 3 -3.39815e-10 -5.08e-13
polygon 33 (hole) 3 -4.52041e-11 -6.75e-14
polygon 34 (hole) 3 -3.90173e-11 -5.83e-14
polygon 35 (hole) 3 -9.59845e-11 -1.43e-13
polygon 36 (hole) 8 -4.28707e-07 -6.40e-10
polygon 37 (hole) 4 -2.18619e-10 -3.27e-13
polygon 38 (hole) 6 -8.37554e-07 -1.25e-09
polygon 39 (hole) 5 -2.92235e-10 -4.36e-13
polygon 40 14053 6.67892e+02 9.98e-01
polygon 41 (hole) 3 -7.43616e-12 -1.11e-14
enclosing rectangle: [2.66754, 55.94194] x [21.44847, 50.25633] km
(53.27 x 28.81 km)
Window area = 669.517 square km
Unit of length: 1 km
Fraction of frame area: 0.436
The code chunk below computes KDE in km, summrise and plot KDE. Now the legend confirms the change of unit measurement.The KDE results show that certain areas (in yellow) have an estimated intensity of more than 30 childcare centres per square kilometre.
kde_childcareSG_km <- density(childcareSG_ppp_km,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian")
summary(kde_childcareSG_km)real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2.667538, 55.94194] x [21.44847, 50.25633] km
dimensions of each pixel: 0.416 x 0.2250614 km
Image is defined on a subset of the rectangular grid
Subset area = 669.941961122495 square km
Subset area fraction = 0.437
Pixel values (inside window):
range = [-5.824417e-15, 30.63698]
integral = 1927.788
mean = 2.877545
plot(kde_childcareSG_km)
7.3 Working with Different Automatic Bandwidth Methods
As mentioned before, there are four automatic bandwidth methods. Let’s compare and see the difference for one ppp data.
bw.CvL(childcareSG_ppp_km) sigma
4.357209
bw.scott(childcareSG_ppp_km) sigma.x sigma.y
2.159749 1.396455
bw.ppl(childcareSG_ppp_km) sigma
0.378997
bw.diggle(childcareSG_ppp_km) sigma
0.2959712
Baddeley et al. (2016) suggest using bw.ppl() for patterns with many tight clusters, while bw.diggle() is better for detecting a single tight cluster in random noise. The following code compares KDE results using both methods.The KDE results from bw.ppl on the right show that certain areas (in yellow) have an estimated intensity of more than 25 childcare centres per square kilometre. The PPL bandwidth (0.379) is close to Diggle’s (0.296) but avoids excessive spikiness, while being much smaller than CvL (4.357), which produces an overly smoothed surface.
kde_childcareSG.ppl <- density(childcareSG_ppp_km,
sigma = bw.ppl,
edge = TRUE,
kernel = "gaussian")
summary(kde_childcareSG.ppl)real-valued pixel image
128 x 128 pixel array (ny, nx)
enclosing rectangle: [2.667538, 55.94194] x [21.44847, 50.25633] km
dimensions of each pixel: 0.416 x 0.2250614 km
Image is defined on a subset of the rectangular grid
Subset area = 669.941961122495 square km
Subset area fraction = 0.437
Pixel values (inside window):
range = [-4.56929e-15, 25.04596]
integral = 1929.932
mean = 2.880745
par(mfrow = c(2,2))#shows two plots side by side (1 row, 2 columns).
plot(kde_childcareSG_km, main = "bw.diggle") #main is the title of the plot
plot(kde_childcareSG.ppl, main = "bw.ppl")
plot(density(childcareSG_ppp_km,
sigma = bw.scott,
edge = TRUE,
kernel = "gaussian"),
main = "bw.sscott")
plot(density(childcareSG_ppp_km,
sigma = bw.CvL,
edge = TRUE,
kernel = "gaussian"),
main = "bw.CvL") 
7.4 Working with Different kernel Methods
Next we compare difference between kernel methods (Gaussian, Epanechnikov, Quartic and Disc) through the charts.
par(mfrow = c(2,2)) # par= plot appearance settings, mfrow= multi-frame by row e.g. 2 * 2.
plot(density(childcareSG_ppp_km,
sigma = 0.2959712,
edge = TRUE,
kernel = "gaussian"),
main = "Gaussian")
plot(density(childcareSG_ppp_km,
sigma = 0.2959712,
edge = TRUE,
kernel = "epanechnikov"),
main = "Epanechnikov")
plot(density(childcareSG_ppp_km,
sigma = 0.2959712,
edge = TRUE,
kernel = "quartic"),
main = "Quartic")
plot(density(childcareSG_ppp_km,
sigma = 0.2959712,
edge = TRUE,
kernel = "disc"),
main = "Disc")
When sigma is same, the KDEs look alike between kernel methods, thus to plot a reasonable KDE map, bandwidth choice matters more.
8. Fixed and Adaptive KDE
8.1 Computing KDE by Using Fixed Bandwidth
In the section, we define the bandwidth 600 metres or 0.6 km for the KDE layer.
kde_childcare_fb <- density(childcareSG_ppp_km,
sigma = 0.6,
edge = TRUE,
kernel = "gaussian")
plot(kde_childcare_fb)
8.2 Computing KDE by Using Adaptive Bandwidth
density.adaptive() generates adaptive KDE and helps us to overcome problems such as highly skewed distribution e.g., rural vs urban.
kde_childcareSG_ab <- adaptive.density(
childcareSG_ppp_km,
method = "kernel")
plot(kde_childcareSG_ab)
We can compare both side by side.Adaptive KDE reduces visual contrast in hotspots because it balances smoothing between dense and sparse areas, whereas fixed bandwidth highlights dense clusters more strongly.
par(mfrow = c(1,2))
plot(kde_childcare_fb, main = "Fixed bandwidth")
plot(kde_childcareSG_ab, main = "Adaptive bandwidth")
9. Plotting Cartopraphic Quality KDE Map
9.1 Converting Gridded Output into Raster
The chunk below uses rast() from terra package to covert im kernel density object into SpatRaster object which is verified by the class().
kde_childcareSG_bw_terra <- rast(kde_childcareSG_km)
class(kde_childcareSG_bw_terra)[1] "SpatRaster"
attr(,"package")
[1] "terra"
kde_childcareSG_bw_terraclass : SpatRaster
size : 128, 128, 1 (nrow, ncol, nlyr)
resolution : 0.4162063, 0.2250614 (x, y)
extent : 2.667538, 55.94194, 21.44847, 50.25633 (xmin, xmax, ymin, ymax)
coord. ref. :
source(s) : memory
name : lyr.1
min value : -5.824417e-15
max value : 3.063698e+01
unit : km
Here the CSR property (coord.ref.) should be SVY21 but is empty as per properties displayed above.
9.2 Assigning Propjection Systems
The code chunk below assign the correct EPSG 3414 to the SpatRaster object.
crs(kde_childcareSG_bw_terra) <- "EPSG:3414"
kde_childcareSG_bw_terraclass : SpatRaster
size : 128, 128, 1 (nrow, ncol, nlyr)
resolution : 0.4162063, 0.2250614 (x, y)
extent : 2.667538, 55.94194, 21.44847, 50.25633 (xmin, xmax, ymin, ymax)
coord. ref. : SVY21 / Singapore TM (EPSG:3414)
source(s) : memory
name : lyr.1
min value : -5.824417e-15
max value : 3.063698e+01
unit : km
It seems that when working with spatial data, some values/attributes may be missing initially or may become missing during data conversion. Therefore, it is important to always do sanity check first.
Before running spatial analysis: simply typing the object name will print its metadata: dimensions, resolution, CRS, range of values, etc.
9.3 Plotting KDE Map with tmap
Let’s plot the KDE map.
tm_shape(kde_childcareSG_bw_terra) + # load the data source for the layers that follow
tm_raster(col.scale =
tm_scale_continuous( # uses a continuous colour scale.
values = "viridis"),
col.legend = tm_legend( # customises the legend for the raster:
title = "Density values",
title.size = 0.7,
text.size = 0.7,
bg.color = "white",
bg.alpha = 0.7,
position = tm_pos_in(
"right", "bottom"),
frame = TRUE)) +
tm_graticules(labels.size = 0.7) + # adds latitude/longitude grid lines
tm_compass() + # adds a compass
tm_layout(scale = 1.0) # keeps default size
layer.1 is the default name for the single KDE band. It holds the density values, one for each pixel in the raster grid.?
10. First Order SPPA at the Planning Subzone Level
We focus on the childcare centres in the four areas: Punggol, Tampines, Choa Chu Kang and Jurong West.
10.1 Geospatial Data Wrangling
10.1.1 Extracting the Study Area
The code chunk uses filter() to create a new variable for each area and plot() for quick preview.
pg <- mpsz_cl %>%
filter(PLN_AREA_N == "PUNGGOL")
tm <- mpsz_cl %>%
filter(PLN_AREA_N == "TAMPINES")
ck <- mpsz_cl %>%
filter(PLN_AREA_N == "CHOA CHU KANG")
jw <- mpsz_cl %>%
filter(PLN_AREA_N == "JURONG WEST")par(mfrow=c(2,2))
plot(st_geometry(pg), main = "Ponggol")
plot(st_geometry(tm), main = "Tampines")
plot(st_geometry(ck), main = "Choa Chu Kang")
plot(st_geometry(jw), main = "Jurong West")
10.1.2 Creating owin Object Using as.owin()
pg_owin = as.owin(pg)
tm_owin = as.owin(tm)
ck_owin = as.owin(ck)
jw_owin = as.owin(jw)10.1.3 Combining Point Event Object and Owin Object
The code chunk below subsets the dataset to the study areas, rescales the unit of measurement from metre to kilometre and finally plot the areas with childcare points.
childcare_pg_ppp = childcare_ppp[pg_owin] #crop childcare points to Punggol
childcare_tm_ppp = childcare_ppp[tm_owin]
childcare_ck_ppp = childcare_ppp[ck_owin]
childcare_jw_ppp = childcare_ppp[jw_owin]childcare_pg_ppp.km = rescale.ppp(childcare_pg_ppp, 1000, "km")
childcare_tm_ppp.km = rescale.ppp(childcare_tm_ppp, 1000, "km")
childcare_ck_ppp.km = rescale.ppp(childcare_ck_ppp, 1000, "km")
childcare_jw_ppp.km = rescale.ppp(childcare_jw_ppp, 1000, "km")par(mfrow=c(2,2))
plot(unmark(childcare_pg_ppp.km),
main = "Punggol")
plot(unmark(childcare_tm_ppp.km),
main = "Tampines")
plot(unmark(childcare_ck_ppp.km),
main = "Choa Chu Kang")
plot(unmark(childcare_jw_ppp.km),
main = "Jurong West")
10.2 Clark-Evans Test
A 95% confidence level will be applied for the test.
Null Hypothesis: Childcare services are randomly distributed.
Alternative Hypothesis: Childcare services are mot randomly distributed.
clarkevans.test(childcare_ck_ppp,
correction ="none",
clipregion = NULL,
alternative = c("two.sided"),
nsim = 99)
Clark-Evans test
No edge correction
Z-test
data: childcare_ck_ppp
R = 0.84097, p-value = 0.008866
alternative hypothesis: two-sided
clarkevans.test(childcare_tm_ppp,
correction ="none",
clipregion = NULL,
alternative = c("two.sided"),
nsim = 99)
Clark-Evans test
No edge correction
Z-test
data: childcare_tm_ppp
R = 0.66817, p-value = 6.58e-12
alternative hypothesis: two-sided
clarkevans.test(childcare_ck_ppp,
correction ="none",
clipregion = NULL,
alternative = c("two.sided"),
nsim = 99)
Clark-Evans test
No edge correction
Z-test
data: childcare_ck_ppp
R = 0.84097, p-value = 0.008866
alternative hypothesis: two-sided
clarkevans.test(childcare_jw_ppp,
correction ="none",
clipregion = NULL,
alternative = c("two.sided"),
nsim = 99)
Clark-Evans test
No edge correction
Z-test
data: childcare_jw_ppp
R = 0.68968, p-value = 4.772e-10
alternative hypothesis: two-sided
Statistical Conclusion Based on the p-values for each study area, we reject all null hypotheses of complete spatial randomness. This indicates that childcare centres are significantly clustered rather than randomly distributed.
Business Communication While the analysis confirms that childcare centres are spatially clustered, this alone does not establish that the clusters coincide with residential hubs. Some residential complexes may lack nearby childcare facilities, while other areas may be overserved. To verify this relationship, further analysis with additional datasets (e.g., population density and housing distribution) is required.
10.3 Computing KDE Surfaces By Planning Area
par(mfrow=c(2,2))
plot(density(childcare_pg_ppp.km,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian"),
main = "Punggol")
plot(density(childcare_tm_ppp.km,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian"),
main = "Tempines")
plot(density(childcare_ck_ppp.km,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian"),
main = "Choa Chu Kang")
plot(density(childcare_jw_ppp.km,
sigma = bw.diggle,
edge = TRUE,
kernel = "gaussian"),
main = "Jurong West")
par(mfrow=c(2,2))
plot(density(childcare_pg_ppp.km,
sigma = bw.ppl,
edge = TRUE,
kernel = "gaussian"),
main = "Punggol")
plot(density(childcare_tm_ppp.km,
sigma = bw.ppl,
edge = TRUE,
kernel = "gaussian"),
main = "Tempines")
plot(density(childcare_ck_ppp.km,
sigma = bw.ppl,
edge = TRUE,
kernel = "gaussian"),
main = "Choa Chu Kang")
plot(density(childcare_jw_ppp.km,
sigma = bw.ppl,
edge = TRUE,
kernel = "gaussian"),
main = "Jurong West")
par(mfrow=c(2,2))
plot(adaptive.density(childcare_pg_ppp.km,
method = "kernel"),
main = "Punggol")
plot(adaptive.density(childcare_tm_ppp.km,
method = "kernel"),
main = "Tempines")
plot(adaptive.density(childcare_ck_ppp.km,
method = "kernel"),
main = "Choa Chu Kang")
plot(adaptive.density(childcare_jw_ppp.km,
method = "kernel"),
main = "Jurong West")
KDE results show that childcare distribution patterns differ across planning areas, but none are evenly distributed. At this stage, we cannot confirm whether the clusters align with residential areas. Further analysis with population and housing data is needed to assess whether the current supply matches local demand to identify under-served areas.
##11 Reference Kam, T. S. 1st Order Spatial Point Patterns Analysis Methods. R for Geospatial Data Science and Analytics. https://r4gdsa.netlify.app/chap04